Skip to content

[RFC] Add ChannelMonitor::get_justice_txs for simplified watchtower integration#4453

Open
FreeOnlineUser wants to merge 1 commit intolightningdevkit:mainfrom
FreeOnlineUser:watchtower-justice-api
Open

[RFC] Add ChannelMonitor::get_justice_txs for simplified watchtower integration#4453
FreeOnlineUser wants to merge 1 commit intolightningdevkit:mainfrom
FreeOnlineUser:watchtower-justice-api

Conversation

@FreeOnlineUser
Copy link
Copy Markdown

Draft PR for design feedback. Implements the approach @TheBlueMatt suggested in lightningdevkit/ldk-node#813 and in review of #2552: move justice tx state tracking inside ChannelMonitor so Persist implementors don't need external queues.

What this does

Adds a single method that replaces the current 3-step dance of counterparty_commitment_txs_from_update() + manual queue + sign_to_local_justice_tx():

pub struct JusticeTransaction {
    pub tx: Transaction,
    pub revoked_commitment_txid: Txid,
    pub commitment_number: u64,
}

impl ChannelMonitor {
    pub fn get_justice_txs(
        &self,
        feerate_per_kw: u64,
        destination_script: ScriptBuf,
    ) -> Vec<JusticeTransaction>;
}

Internally, ChannelMonitorImpl stores recent counterparty CommitmentTransactions (populated during update application, pruned to entries within one revocation of current). get_justice_txs checks which have revocation secrets available, builds and signs the justice tx, and returns the result.

Changes

  • channelmonitor.rs: new JusticeTransaction struct, latest_counterparty_commitment_txs field, TLV 39 serialization (optional, backwards-compatible), get_justice_txs() method
  • test_utils.rs: simplified WatchtowerPersister (removed JusticeTxData, unsigned_justice_tx_data queue, form_justice_data_from_commitment)
  • 127 additions, 83 deletions. All existing tests pass.

Splice handling

Pruning uses commitment numbers, not entry count. During a splice, multiple entries share the same commitment number (one per funding scope) and all are retained.

Backwards compatibility

  • TLV 39 is optional. Old monitors load with empty vec, populated on next update.
  • Old nodes reading monitors written by new code skip the unknown TLV field.
  • Existing counterparty_commitment_txs_from_update() and sign_to_local_justice_tx() APIs unchanged.

Design questions

Looking for input on these before moving out of draft:

  1. Signed vs unsigned return? Currently returns fully signed transactions. Returning unsigned gives callers more flexibility on fee choice at broadcast time. Preference?

  2. HTLC outputs? This only covers to_local justice. HTLC-timeout and HTLC-success outputs on revoked commitments are not included. Worth adding here, or separate follow-up?

  3. Feerate source? Caller provides feerate_per_kw. An alternative would be accepting a fee estimator. Any preference?

  4. Dust filtering? If the revokeable output is dust, revokeable_output_index() returns None and the entry is skipped silently. Right behavior, or surface this to the caller?

@ldk-reviews-bot
Copy link
Copy Markdown

ldk-reviews-bot commented Mar 1, 2026

👋 I see @valentinewallace was un-assigned.
If you'd like another reviewer assignment, please click here.

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 1, 2026

Codecov Report

❌ Patch coverage is 95.31250% with 9 lines in your changes missing coverage. Please review.
✅ Project coverage is 87.09%. Comparing base (db42ad6) to head (92e35d2).
⚠️ Report is 8 commits behind head on main.

Files with missing lines Patch % Lines
lightning/src/chain/channelmonitor.rs 94.88% 7 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4453      +/-   ##
==========================================
- Coverage   87.10%   87.09%   -0.01%     
==========================================
  Files         163      163              
  Lines      108765   108977     +212     
  Branches   108765   108977     +212     
==========================================
+ Hits        94743    94917     +174     
- Misses      11535    11570      +35     
- Partials     2487     2490       +3     
Flag Coverage Δ
fuzzing 40.17% <19.31%> (-0.02%) ⬇️
tests 86.21% <95.31%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

/// Contains the current and previous counterparty commitment(s). With splicing,
/// there may be multiple entries per commitment number (one per funding scope).
/// Pruned to remove entries more than one revocation old.
latest_counterparty_commitment_txs: Vec<CommitmentTransaction>,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be in the funding scope to support splicing. Also should probably just be cur_... and prev_... to match existing API pattern rather than a vec that is max-size 2.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved to FundingScope as cur/prev fields. TLV 13/15 on FundingScope, 39/41 on the outer scope for backwards compat.

Err(_) => break,
}
}
// Use the simplified get_justice_txs API
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude loves to leave comments about how it changed the code that are totally useless - reading the code I don't care what changes claude made in the past, I care if there's something useful to know now. Probably can just drop this.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed, and noted for future.

///
/// Returns a list of [`JusticeTransaction`]s, each containing a fully signed
/// transaction and metadata about the revoked commitment it punishes.
pub fn get_justice_txs(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This API doesn't really make sense. There's no information on when I should call this or when the monitor "knows about" a revoked tx. We probably want to do soemthing like the old API where you can fetch revoked transactions in relation to a specific monitor update.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced with sign_initial_justice_tx() for persist_new_channel and sign_justice_txs_from_update(update) for update_persisted_channel. Each is tied to a specific point in the persistence pipeline.

@FreeOnlineUser FreeOnlineUser force-pushed the watchtower-justice-api branch 2 times, most recently from 9af37e1 to 034c377 Compare March 2, 2026 23:38
Copy link
Copy Markdown
Collaborator

@TheBlueMatt TheBlueMatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I think this basically looks good!

let mut result = Vec::new();
for commitment_tx in &to_sign {
if let Some(jtx) =
self.try_sign_justice_tx(commitment_tx, feerate_per_kw, destination_script.clone())
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also build justice transactions for all the potential HTLC transactions? I assume we can...

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added. try_sign_justice_txs now iterates nondust_htlcs() on the revoked commitment, builds a spending tx for each HTLC output, and signs with sign_justice_revoked_htlc. Witness follows the same pattern as RevokedHTLCOutput in package.rs. Dust HTLCs are skipped when fee exceeds value.

Also added test_justice_tx_htlc_from_monitor_updates which routes a payment, captures the commitment with a pending HTLC, revokes it, and verifies the watchtower gets justice txs for both to_local and the HTLC output.

funding.prev_counterparty_commitment_tx =
funding.cur_counterparty_commitment_tx.take();
funding.cur_counterparty_commitment_tx = Some(commitment_tx);
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume we can/should do this?

Suggested change
}
} else {
debug_assert!(false);
}

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

FreeOnlineUser added a commit to FreeOnlineUser/rust-lightning that referenced this pull request Mar 10, 2026
Extends try_sign_justice_txs to also sweep HTLC outputs when a
counterparty broadcasts a revoked commitment. For each non-dust HTLC,
builds a spending transaction and signs it with sign_justice_revoked_htlc.

Updates WatchtowerPersister to store multiple justice txs per revoked
commitment (Vec<Transaction> instead of single Transaction).

Addresses review feedback from TheBlueMatt on PR lightningdevkit#4453.
Copy link
Copy Markdown
Author

@FreeOnlineUser FreeOnlineUser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated with both changes. API is now sign_initial_justice_txs / try_sign_justice_txs (plural) since they return a Vec covering to_local + HTLCs. WatchtowerPersister stores Vec<Transaction> per revoked txid accordingly.

@FreeOnlineUser FreeOnlineUser marked this pull request as ready for review March 10, 2026 10:56
@FreeOnlineUser FreeOnlineUser marked this pull request as draft March 10, 2026 11:01
@joostjager joostjager removed their request for review March 17, 2026 10:59
@FreeOnlineUser FreeOnlineUser marked this pull request as ready for review March 18, 2026 13:09
FreeOnlineUser added a commit to FreeOnlineUser/rust-lightning that referenced this pull request Mar 19, 2026
Extends try_sign_justice_txs to also sweep HTLC outputs when a
counterparty broadcasts a revoked commitment. For each non-dust HTLC,
builds a spending transaction and signs it with sign_justice_revoked_htlc.

Updates WatchtowerPersister to store multiple justice txs per revoked
commitment (Vec<Transaction> instead of single Transaction).

Addresses review feedback from TheBlueMatt on PR lightningdevkit#4453.
@FreeOnlineUser FreeOnlineUser force-pushed the watchtower-justice-api branch from 7455283 to e3976a0 Compare March 19, 2026 08:17
Comment on lines +4660 to +4668
let mut to_sign = Vec::new();
for funding in core::iter::once(&mut self.funding).chain(self.pending_funding.iter_mut()) {
let should_take =
funding.prev_counterparty_commitment_tx.as_ref().is_some_and(|prev| {
self.commitment_secrets.get_secret(prev.commitment_number()).is_some()
});
if should_take {
to_sign.push(funding.prev_counterparty_commitment_tx.take().unwrap());
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: sign_justice_txs_from_update mutates the monitor's in-memory state via .take() on prev_counterparty_commitment_tx. Since this is called inside a Persist callback, if the persist operation fails and is retried, the prev commitment has already been removed from memory — the retry will produce no justice txs.

The mutation is committed immediately (interior mutability via Mutex), but the serialized/persisted state may not reflect it if the persist fails. On a process restart, the old persisted state would have the prev, but during the same process lifetime, the data is gone.

Consider either:

  • Making this idempotent (don't take, just check and sign — dedup on the caller side)
  • Or cloning instead of taking, and only clearing after confirmed persist

Comment on lines +4643 to +4656
// Store new commitments, rotating cur -> prev per funding scope.
for commitment_tx in new_commitment_txs {
let txid = commitment_tx.trust().built_transaction().txid;
let funding = core::iter::once(&mut self.funding)
.chain(self.pending_funding.iter_mut())
.find(|f| f.current_counterparty_commitment_txid == Some(txid));
if let Some(funding) = funding {
funding.prev_counterparty_commitment_tx =
funding.cur_counterparty_commitment_tx.take();
funding.cur_counterparty_commitment_tx = Some(commitment_tx);
} else {
debug_assert!(false);
}
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rotation prev = cur.take(); cur = new; happens before checking whether the old prev had a revocation secret available. If the caller skips calling sign_justice_txs_from_update for an update that only contains a revocation (no new commitment), and then calls it for the next update that does contain a new commitment, the old prev is overwritten without ever being signed.

This is safe under the assumption that the caller invokes this for every ChannelMonitorUpdate, but the doc comment doesn't state this requirement. The method should either:

  1. Document that it MUST be called for every update (not just commitment-bearing ones), or
  2. Check the old prev for revocation secrets before the rotation, sign it if ready, and then rotate.

Some(tx) => tx,
None => return Vec::new(),
};
self.funding.cur_counterparty_commitment_tx = Some(commitment_tx.clone());
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This unconditionally overwrites cur_counterparty_commitment_tx with the initial commitment. Since provide_initial_counterparty_commitment_tx (line 3539) already sets this field, this is redundant in the normal path.

More importantly, since sign_initial_justice_txs is a pub method, a caller who mistakenly invokes it after updates have been applied would reset cur_counterparty_commitment_tx back to the initial commitment, silently corrupting the rotation state for sign_justice_txs_from_update. Consider guarding this (e.g., only set if cur_counterparty_commitment_tx.is_none()).

funding.cur_counterparty_commitment_tx.take();
funding.cur_counterparty_commitment_tx = Some(commitment_tx);
} else {
debug_assert!(false);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

debug_assert!(false) only fires in debug builds. In release, the new commitment is silently dropped, meaning the watchtower will miss this commitment entirely (and any subsequent ones for this funding scope). At minimum, this should log a warning. Consider log_error! or returning an error.

Comment on lines +4743 to +4746
let fee = Amount::from_sat(crate::chain::chaininterface::fee_for_weight(feerate_per_kw as u32,
// Base tx weight + witness weight
Transaction { version: Version::TWO, lock_time: LockTime::ZERO, input: input.clone(), output: vec![TxOut { script_pubkey: destination_script.clone(), value: htlc_value }] }.weight().to_wu() + weight_estimate
));
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: This constructs a full Transaction object (cloning input and destination_script) purely for weight estimation. The existing build_to_local_justice_tx does this too, but here it happens in a loop for every HTLC. Consider computing the base weight once outside the loop since it's the same for all HTLC justice txs (single input, single output to the same destination_script).

Comment on lines +1216 to +1217
(13, cur_counterparty_commitment_tx, option),
(15, prev_counterparty_commitment_tx, option),
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These fields are added to FundingScope's impl_writeable_tlv_based! (at TLV 13/15, which applies to pending_funding entries), and to the top-level write_chanmon_internal (at TLV 39/41, for the main funding scope). During deserialization, the main funding scope reads from TLV 39/41 while pending_funding entries read from their per-FundingScope TLV 13/15.

This split serialization works but is subtle and fragile — a future change that serializes the main funding as a whole FundingScope would silently double-write the data. Worth adding a comment explaining why these live in two places.

Comment on lines 705 to 761
@@ -722,24 +705,19 @@ impl WatchtowerPersister {
.get(&channel_id)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The old justice_tx method returned the single justice tx directly. Now it returns only the first element from the vec via .first(). This works because try_sign_justice_txs pushes the to_local justice tx first, but this ordering is an implicit invariant. If the order in try_sign_justice_txs ever changes, existing callers of justice_tx would silently get the wrong transaction. Consider adding a comment documenting this assumption, or filtering by output type.

@ldk-claude-review-bot
Copy link
Copy Markdown
Collaborator

ldk-claude-review-bot commented Mar 19, 2026

I've thoroughly re-examined every file and hunk in the diff against my prior review findings and the codebase. The prior review was comprehensive; the HTLC witness construction, signer access patterns, feerate handling, and TLV serialization all correctly follow existing codebase patterns. No new issues beyond those already posted.

Review Summary

All prior inline comments remain valid and unresolved. No new issues found in this pass.

Prior inline comments (still valid, not re-posted)

Security-critical bugs:

  • channelmonitor.rs:3634 — False invariant: CommitmentSecret and commitment steps CAN be batched in a single ChannelMonitorUpdate (via revoke_and_ack freeing holding cell), causing permanent loss of revoked commitment justice txs
  • channelmonitor.rs:3641 — Sticky guard in pending_funding permanently blocks rotation after first revocation; only the very first revoked commitment ever produces justice txs for pending_funding scopes
  • channelmonitor.rs:4894 — Rotation and signing ordered incorrectly for batched updates containing both commitment and secret steps

Bugs:

  • channelmonitor.rs:2329 — Doc claim of idempotency is false; .take() in rotation means retry produces different results
  • channelmonitor.rs:4654debug_assert!(false) silently drops commitment in release builds
  • channelmonitor.rs:4668.take() mutates in-memory state during Persist callback; failed persist + retry loses data
  • test_utils.rs:802-804 — Duplicate accumulation: idempotent get_pending_justice_txs + Vec::push causes duplicates across consecutive watchtower-relevant updates

Code quality / correctness:

  • channelmonitor.rs:1231 — Split serialization (TLV 13/15 vs 43/45) needs explanatory comment
  • channelmonitor.rs:4377 — Comment scoped too narrowly; post-loop also handles RenegotiatedFunding
  • channelmonitor.rs:4388 — Inconsistent indentation
  • channelmonitor.rs:4633sign_initial_justice_txs unconditionally overwrites cur without guard
  • channelmonitor.rs:4656 — Rotation before secret check; doc doesn't state caller must invoke for every update
  • channelmonitor.rs:4953 — Per-HTLC weight estimation constructs throwaway Transaction in a loop
  • test_utils.rs:761justice_tx() returns .first() with implicit ordering assumption
  • functional_tests.rs:32 — Unused import INITIAL_COMMITMENT_NUMBER
  • functional_tests.rs:1140 — Duplicate assert!(!pending.is_empty())
  • functional_tests.rs:1141 — "crash_recovery" test doesn't serialize/deserialize
  • functional_tests.rs:1143 — Weak assertion (commitment_number < INITIAL_COMMITMENT_NUMBER is trivially true)

Cross-cutting concerns

  1. No tests exercise the batched revoke_and_ack scenario — the most critical security bug has zero test coverage.
  2. No tests exercise the LatestCounterpartyCommitment path — all tests use test_legacy_channel_config(), so the sticky guard bug in pending_funding is untested.
  3. No serialization round-trip test — TLV 43/45 persistence (the actual crash recovery invariant) is never tested.
  4. Storage overhead for all users — every ChannelMonitor unconditionally stores two CommitmentTransaction objects per funding scope, even for users not using the watchtower API.
  5. PR description says "TLV 39 serialization" — code actually uses TLV 43/45; TLV 39 is best_block.previous_blocks.

/// Stores new counterparty commitment(s) from the update and signs any
/// previously-stored commitments whose revocation secrets are now available.
///
/// Intended to be called during [`Persist::update_persisted_channel`].
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs a lot more detail. One big concern is that we need to describe how crash-safety works. Let's say I start storing a ChannelMonitorUpdate while I'm waiting on my watchtower to persist my new transactions, but then crash before the second. I have to make sure the transactions get reloaded on restart somehow and aren't lost, even if the ChannelMonitorUpdate made it to disk (and thus wasn't replayed). Not sure if this should be handled by forcing downstream logic to persist before the monitor or if we should have some way to do a replay.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, the update-already-persisted case means replay alone isn't enough.

I think the cleanest fix is: keep the prev/cur commitment data in the monitor (clone instead of .take()), serialize it, and add a get_pending_justice_txs() method that reads the stored state without mutating it. On restart, the Persist impl calls this against the loaded monitor to recover any justice txs it didn't finish sending.

The full crash-safety model would be:

  • Normal path: Persist impl gets justice txs from the update, sends to watchtower, returns Completed
  • Crash recovery: Persist impl calls get_pending_justice_txs() on the loaded monitor at startup, re-sends anything the watchtower doesn't have
  • The watchtower deduplicates on its end (or the Persist impl tracks what's been acked)

This makes the API idempotent and crash-safe without requiring any ordering constraints on when the monitor update is persisted. I'll rework the implementation and expand the docs.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that seems reasonable. My question then is do we even need sign_justice_txs_from_update? Should we instead drop the from-update API and replace it with a simple bool on a ChannelMonitorUpdate to check if there is a commitment update that may need to be written to a watchtower?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Dropped both sign_justice_txs_from_update and sign_initial_justice_txs. The API is now just get_pending_justice_txs() plus ChannelMonitorUpdate::updates_watchtower_state() to signal when to call it. Rotation happens in the normal update path.

@ldk-reviews-bot
Copy link
Copy Markdown

🔔 1st Reminder

Hey @valentinewallace! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@ldk-reviews-bot
Copy link
Copy Markdown

🔔 2nd Reminder

Hey @valentinewallace! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@valentinewallace
Copy link
Copy Markdown
Contributor

Feel free to re-request when feedback is addressed!

@valentinewallace valentinewallace removed their request for review March 25, 2026 14:50
@FreeOnlineUser FreeOnlineUser force-pushed the watchtower-justice-api branch from e3976a0 to 0c79167 Compare April 2, 2026 10:46
Comment on lines +2327 to +2329
/// The previous commitment data is cloned rather than consumed, so calling
/// this multiple times for the same update produces the same justice
/// transactions. The data also survives serialization, enabling recovery
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: This doc claim is false — the method is not idempotent within a single process lifetime.

While the signing phase clones prev (line 4914), the rotation phase at lines 4888-4889 still uses cur_counterparty_commitment_tx.take(), which consumes cur. On a second call with the same update:

  1. counterparty_commitment_txs_from_update re-extracts the same new commitment tx
  2. The funding scope match succeeds (same txid)
  3. prev = cur.take() — but cur now holds the new commitment (set by the first call), so prev gets overwritten with the new commitment, discarding the old revoked commitment
  4. Secret matching against prev fails (new commitment number ≠ revocation index)
  5. No justice transactions are produced

This matters if a Persist implementation retries on transient failure (e.g., network error to a remote store) without restarting the process. The watchtower would silently miss revoked commitments.

Suggest either fixing the rotation to not use .take() (e.g., clone cur into prev), or correcting this doc to state that the method must only be called once per update per process lifetime.

Comment on lines +4882 to +4894
for commitment_tx in new_commitment_txs {
let txid = commitment_tx.trust().built_transaction().txid;
let funding = core::iter::once(&mut self.funding)
.chain(self.pending_funding.iter_mut())
.find(|f| f.current_counterparty_commitment_txid == Some(txid));
if let Some(funding) = funding {
funding.prev_counterparty_commitment_tx =
funding.cur_counterparty_commitment_tx.take();
funding.cur_counterparty_commitment_tx = Some(commitment_tx);
} else {
debug_assert!(false);
}
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: The rotation and signing are ordered incorrectly for the case where a single ChannelMonitorUpdate contains both a new counterparty commitment step and a CommitmentSecret step that revokes the commitment that was in prev before the rotation.

Example: before this method runs, cur = C(n), prev = C(n+1). An update arrives containing LatestCounterpartyCommitmentTXInfo(C(n-1)) and CommitmentSecret(idx = n+1).

  1. Rotation: prev = C(n) (was cur), cur = C(n-1)C(n+1) is lost
  2. new_secret_indices = [n+1]
  3. Check prev.commitment_number() = n — doesn't match n+1
  4. Justice tx for C(n+1) is never produced

The fix is to check/sign the old prev against available secrets before performing the rotation. Whether this scenario actually occurs in current LDK channel code depends on whether commitment_signed and revoke_and_ack processing ever produce a single batched ChannelMonitorUpdate — but the method's contract doesn't prohibit it, and future code changes could introduce this pattern.

Comment on lines +1140 to +1143
// The pending justice tx corresponds to the most recently revoked commitment,
// which may differ from the one we captured earlier if multiple rotations occurred.
// Verify it spends a valid revoked commitment output.
assert!(pending[0].commitment_number < INITIAL_COMMITMENT_NUMBER);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: This assertion is extremely weak — commitment_number < INITIAL_COMMITMENT_NUMBER is true for any non-initial commitment and would pass even if get_pending_justice_txs returned a completely unrelated commitment.

Since only one rotation occurs in this test, consider asserting that pending[0].revoked_commitment_txid == revoked_txid, which would actually validate the crash recovery round-trip rather than trivially passing.

@FreeOnlineUser FreeOnlineUser force-pushed the watchtower-justice-api branch 3 times, most recently from 0e293ab to e0ebc67 Compare April 2, 2026 13:39
/// revoked commitment per funding scope. If multiple commitment rotations
/// occurred between the crash and restart (e.g., the counterparty sent
/// several updates while the process was down), earlier revoked commitments
/// will have been overwritten. In practice, this requires the counterparty
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ha, I don't think we need to document things that are impossible :)

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed :)

/// Stores new counterparty commitment(s) from the update and signs any
/// previously-stored commitments whose revocation secrets are now available.
///
/// Intended to be called during [`Persist::update_persisted_channel`].
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that seems reasonable. My question then is do we even need sign_justice_txs_from_update? Should we instead drop the from-update API and replace it with a simple bool on a ChannelMonitorUpdate to check if there is a commitment update that may need to be written to a watchtower?

self.inner.lock().unwrap().sign_to_local_justice_tx(justice_tx, input_idx, value, commitment_number)
}

/// Stores the initial counterparty commitment and returns signed justice transactions
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm? This doesn't "store" anything?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the method. Storage now happens during update_counterparty_commitment_data and the LatestCounterpartyCommitmentTXInfo processing path.

/// 2. Send them to the watchtower
/// 3. Persist the monitor update to disk
///
/// If the process crashes before step 3, the monitor update was never
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In practice we could restart and the counterparty could give us a different commitment transaction. They shouldn't, but they could. I imagine this implies we should instead only send the justice txn to the watchtower after persistence?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. The WatchtowerPersister example now calls get_pending_justice_txs after persistence completes, not before.

/// the same transactions.
///
/// If the process crashes after step 3, the update won't replay, but the
/// previous commitment data is still present in the deserialized monitor.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay but what if we lose connectivity to the watchtower but not our peer? We'd then step the channel forward twice and on restart lose the previous state. ISTM we have to tell clients to hold on to monitor updates if there's still a pending watchtower write for a previous update.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Added doc guidance that the Persist implementation should delay completing monitor updates until previously obtained justice transactions have been delivered.

@FreeOnlineUser FreeOnlineUser force-pushed the watchtower-justice-api branch 4 times, most recently from a9a7221 to a07c56d Compare April 3, 2026 01:48
Comment on lines +796 to +804
for jtx in justice_txs {
self.watchtower_state
.lock()
.unwrap()
.get_mut(&data.channel_id())
.unwrap()
.entry(jtx.revoked_commitment_txid)
.or_insert_with(Vec::new)
.push(jtx.tx);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: get_pending_justice_txs is idempotent by design — it returns the same results on repeated calls for the same monitor state. But this code uses Vec::push, which accumulates duplicates across consecutive watchtower-relevant updates that don't rotate prev.

Example: a CommitmentSecret update followed by a RenegotiatedFunding update — both trigger updates_watchtower_state(), and prev doesn't change between them, so the same justice tx(s) are pushed into the Vec twice.

The old code used HashMap::insert(commitment_txid, signed_justice_tx) which was an idempotent upsert. Consider either:

  • Using entry().or_insert() instead of push (replacing rather than appending when the txid already has entries)
  • Or clearing the Vec for this txid before pushing the new results
Suggested change
for jtx in justice_txs {
self.watchtower_state
.lock()
.unwrap()
.get_mut(&data.channel_id())
.unwrap()
.entry(jtx.revoked_commitment_txid)
.or_insert_with(Vec::new)
.push(jtx.tx);
for jtx in justice_txs {
self.watchtower_state
.lock()
.unwrap()
.get_mut(&data.channel_id())
.unwrap()
.insert(jtx.revoked_commitment_txid, vec![jtx.tx]);
}

Note: this replaces rather than appends, which is correct since get_pending_justice_txs always returns the complete set for each revoked commitment.

Comment on lines +4368 to +4369
// Populate cur/prev for the LatestCounterpartyCommitmentTXInfo path, which
// doesn't go through update_counterparty_commitment_data.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment says "for the LatestCounterpartyCommitmentTXInfo path" but this post-loop is also the only mechanism that sets cur_counterparty_commitment_tx for RenegotiatedFunding scopes (created at line 4097 with cur_counterparty_commitment_tx: None).

If this code is removed during the planned LatestCounterpartyCommitmentTXInfo deprecation, RenegotiatedFunding (splice) scopes would silently lose their initial commitment tracking — cur would stay None, so after the next commitment update, prev would be None instead of the initial splice commitment, and the watchtower would miss that revocation.

Suggest updating the comment:

Suggested change
// Populate cur/prev for the LatestCounterpartyCommitmentTXInfo path, which
// doesn't go through update_counterparty_commitment_data.
// Populate cur/prev for paths that don't go through
// update_counterparty_commitment_data: LatestCounterpartyCommitmentTXInfo
// (legacy, to be removed) and RenegotiatedFunding (splice).

Comment on lines +4927 to +4941
let fee = Amount::from_sat(crate::chain::chaininterface::fee_for_weight(
feerate_per_kw as u32,
// Base tx weight + witness weight
Transaction {
version: Version::TWO,
lock_time: LockTime::ZERO,
input: input.clone(),
output: vec![TxOut {
script_pubkey: destination_script.clone(),
value: htlc_value,
}],
}
.weight()
.to_wu() + weight_estimate,
));
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fee estimation constructs a throwaway Transaction with input.clone() and destination_script.clone() per HTLC, but the base tx weight (excluding witness) is identical for every HTLC justice tx in this loop — same version, locktime, single input (empty witness), single output to the same destination script.

Consider hoisting the base weight calculation before the loop and only varying the witness weight per HTLC type:

let base_weight = Transaction {
    version: Version::TWO,
    lock_time: LockTime::ZERO,
    input: vec![TxIn { ..Default::default() }],
    output: vec![TxOut { script_pubkey: destination_script.clone(), value: Amount::ZERO }],
}.weight().to_wu();

// Then inside the loop:
let fee = fee_for_weight(feerate_per_kw as u32, base_weight + weight_estimate);

This avoids cloning input and destination_script on every iteration. (Reiterating prior comment #5 with a concrete suggestion since it's in a per-HTLC loop that could be large.)

Comment on lines +1100 to +1141
pub fn test_justice_tx_crash_recovery() {
// Verify that get_pending_justice_txs returns justice txs for the most
// recently revoked counterparty commitment, enabling crash recovery.
let chanmon_cfgs = create_chanmon_cfgs(2);
let destination_script = chanmon_cfgs[1].keys_manager.get_destination_script([0; 32]).unwrap();
let persisters = [
WatchtowerPersister::new(
chanmon_cfgs[0].keys_manager.get_destination_script([0; 32]).unwrap(),
),
WatchtowerPersister::new(destination_script.clone()),
];
let node_cfgs = create_node_cfgs_with_persisters(2, &chanmon_cfgs, persisters.iter().collect());
let legacy_cfg = test_legacy_channel_config();
let node_chanmgrs =
create_node_chanmgrs(2, &node_cfgs, &[Some(legacy_cfg.clone()), Some(legacy_cfg)]);
let nodes = create_network(2, &node_cfgs, &node_chanmgrs);

let (_, _, channel_id, _) = create_announced_chan_between_nodes(&nodes, 0, 1);

// Move to a non-initial commitment
send_payment(&nodes[0], &[&nodes[1]], 5_000_000);

// Capture the commitment that will be revoked
let revoked_local_txn = get_local_commitment_txn!(nodes[0], channel_id);
assert_eq!(revoked_local_txn.len(), 1);
let revoked_txid = revoked_local_txn[0].compute_txid();

// Revoke it
send_payment(&nodes[0], &[&nodes[1]], 5_000_000);

// The persister should have a justice tx for this revoked commitment
assert!(persisters[1].justice_tx(channel_id, &revoked_txid).is_some());

// get_pending_justice_txs should also return it, since prev still holds
// the revoked commitment data (cloned, not consumed by signing).
let pending = {
let monitor = get_monitor!(nodes[1], channel_id);
monitor.get_pending_justice_txs(FEERATE_FLOOR_SATS_PER_KW as u64, destination_script)
};
assert!(!pending.is_empty());
assert!(pending[0].commitment_number < INITIAL_COMMITMENT_NUMBER);
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is named "crash_recovery" but doesn't actually test the crash recovery path — it never serializes, discards in-memory state, and deserializes the monitor. It just checks that get_pending_justice_txs returns results during normal (non-crash) operation, which is already covered by test_justice_tx_idempotent.

A meaningful crash recovery test would:

  1. Serialize the monitor after the revocation
  2. Deserialize it into a fresh ChannelMonitor
  3. Call get_pending_justice_txs on the deserialized monitor
  4. Verify the justice tx is present and valid

This would exercise the TLV 43/45 serialization round-trip and confirm that prev_counterparty_commitment_tx survives persistence — the actual invariant that crash recovery depends on.

@FreeOnlineUser FreeOnlineUser force-pushed the watchtower-justice-api branch from a07c56d to ee59374 Compare April 3, 2026 04:36
Comment on lines +3633 to +3641
if !self
.funding
.prev_counterparty_commitment_tx
.as_ref()
.is_some_and(|p| self.commitment_secrets.get_secret(p.commitment_number()).is_some())
{
self.funding.prev_counterparty_commitment_tx =
self.funding.cur_counterparty_commitment_tx.take();
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug (security-critical): Guard condition causes permanent loss of revoked commitments after the first channel state update.

This guard checks if prev already holds a revoked commitment (secret available) and, if so, refuses to rotate. The problem is this check is permanently sticky — once a secret is provided, get_secret() returns Some forever. After the first CommitmentSecret update revokes the commitment in prev, every subsequent LatestCounterpartyCommitment update will hit this guard, and the outgoing cur (which becomes revoked in the next CommitmentSecret update) is silently dropped.

Trace with 4 sequential updates (normal commitment_signed/revoke_and_ack flow):

Update Step prev after cur after Justice produced
A (commitment_signed) LatestCounterpartyCommitment(N-1) C(N) C(N-1)
B (revoke_and_ack) CommitmentSecret(N) C(N) [revoked] C(N-1) C(N) ✓
C (commitment_signed) LatestCounterpartyCommitment(N-2) C(N) [guard blocks] C(N-2) C(N) (dup)
D (revoke_and_ack) CommitmentSecret(N-1) C(N) [still locked] C(N-2) C(N) (dup)

At Update C, C(N-1) was in cur and is silently overwritten. At Update D, its secret arrives but C(N-1) is in neither prev nor cur. The watchtower never gets the justice tx for C(N-1).

This repeats for every subsequent commitment — only C(N) (the very first revoked commitment) is ever returned. All later revoked commitments are lost.

Fix: Remove the guard entirely. Without it, the rotation prev = cur.take(); cur = new; works correctly: each commitment spends exactly one update cycle in prev, and by the time its secret arrives (in the next CommitmentSecret update), it's still in prev.

The same bug exists at lines 3654-3658 for pending_funding scopes.

Comment on lines +4395 to +4396
funding.prev_counterparty_commitment_tx = funding.cur_counterparty_commitment_tx.take();
funding.cur_counterparty_commitment_tx = Some(commitment_tx);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Inconsistent indentation — line 4395 has an extra leading tab compared to line 4396. Both should be at the same indentation level within the if block.

Suggested change
funding.prev_counterparty_commitment_tx = funding.cur_counterparty_commitment_tx.take();
funding.cur_counterparty_commitment_tx = Some(commitment_tx);
funding.prev_counterparty_commitment_tx = funding.cur_counterparty_commitment_tx.take();
funding.cur_counterparty_commitment_tx = Some(commitment_tx);

@FreeOnlineUser FreeOnlineUser force-pushed the watchtower-justice-api branch 2 times, most recently from b689315 to cab9a18 Compare April 3, 2026 04:49
Comment on lines +1139 to +1140
assert!(!pending.is_empty());
assert!(!pending.is_empty());
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Lines 1139-1140 are identical — assert!(!pending.is_empty()) is duplicated. This looks like a copy-paste error. The second assertion should check something meaningful, e.g.:

Suggested change
assert!(!pending.is_empty());
assert!(!pending.is_empty());
assert!(!pending.is_empty());
assert_eq!(pending[0].revoked_commitment_txid, revoked_txid);

This would actually validate that get_pending_justice_txs returns the justice tx for the correct revoked commitment, rather than just checking non-emptiness twice.

commitment_tx_base_weight, second_stage_tx_fees_sat, COMMITMENT_TX_WEIGHT_PER_HTLC,
OFFERED_HTLC_SCRIPT_WEIGHT,
};
use crate::ln::channel::INITIAL_COMMITMENT_NUMBER;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused import: INITIAL_COMMITMENT_NUMBER is added here but never used in any of the new test functions. The only uses in this file (line ~6797) define a local const that shadows this import. This will produce a compiler warning.

Suggested change
use crate::ln::channel::INITIAL_COMMITMENT_NUMBER;
use crate::ln::channel::{

Comment on lines +3629 to +3634
// Safe to overwrite prev: CommitmentSecret and new counterparty commitments
// are always in separate ChannelMonitorUpdates, so the Persist callback has
// a chance to call get_pending_justice_txs between revocation and rotation.
self.funding.prev_counterparty_commitment_tx =
self.funding.cur_counterparty_commitment_tx.take();
self.funding.cur_counterparty_commitment_tx = Some(current_funding_commitment_tx.clone());
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug (security): The comment's claimed invariant is false. revoke_and_ack processing in channel.rs (lines 9211-9216, 9239-9244) batches CommitmentSecret + LatestCounterpartyCommitment/LatestCounterpartyCommitmentTXInfo into a single ChannelMonitorUpdate whenever the holding cell is freed or a new commitment is required.

This causes missed justice transactions. Trace:

Setup: prev = C(K), cur = C(K-1). Neither has been revoked yet.

Batched update: [CommitmentSecret(K), LatestCounterpartyCommitment(C(K-2))]

  1. CommitmentSecret(K) processed → secret for K stored
  2. LatestCounterpartyCommitment(C(K-2)) processed → unconditional rotation: prev = C(K-1), cur = C(K-2). C(K) is lost from both slots.
  3. Persist callback → get_pending_justice_txs(): prev = C(K-1) (no secret for K-1 yet), cur = C(K-2) (no secret) → no justice tx produced for C(K)
  4. C(K)'s secret exists but C(K) is in neither prev nor cur. Its justice tx is permanently lost.

This happens whenever the remote's revoke_and_ack triggers maybe_free_holding_cell_htlcs() or require_commitment — a common scenario during high-throughput payment forwarding.

The same issue affects the post-loop at lines 4376-4391 (the LatestCounterpartyCommitmentTXInfo path), which also unconditionally rotates prev.

Fix suggestion: Before overwriting prev, check if it holds a commitment whose secret is now available. If so, produce justice txs for it before rotating — or buffer the outgoing prev in a temporary list that get_pending_justice_txs also checks. Alternatively, the rotation logic could be moved to the CommitmentSecret step (where the revocation actually happens) rather than the commitment step (where the new commitment arrives).

@FreeOnlineUser FreeOnlineUser force-pushed the watchtower-justice-api branch 2 times, most recently from ef8805e to b8fc4f9 Compare April 3, 2026 05:57
Adds sign_initial_justice_txs(), sign_justice_txs_from_update(), and
get_pending_justice_txs() to ChannelMonitor, enabling Persist implementors
to obtain signed justice transactions for both to_local and HTLC outputs
without maintaining external state.

Storage uses cur/prev counterparty commitment fields on FundingScope,
matching the existing pattern and supporting splicing. The API is
crash-safe: commitment data is cloned rather than consumed, and
get_pending_justice_txs() allows recovery after restart.

Simplifies WatchtowerPersister in test_utils by removing manual queue
and signing logic.

Addresses feedback from lightningdevkit/ldk-node#813 and picks up
the intent of lightningdevkit#2552.
@FreeOnlineUser FreeOnlineUser force-pushed the watchtower-justice-api branch from b8fc4f9 to 92e35d2 Compare April 3, 2026 06:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants